Complex Annotations with NooJ
Abstract
NooJ associates each text with a Text Annotation Structure, in which each recognized linguistic unit is represented by an annotation. Annotations store the position of the text units to be represented, their length, and linguistic information. NooJ can represent and process complex annotations, such as those that represent units inside word forms, as well as those that are discontinuous. We demonstrate how to use NooJ's morphological, lexical, and syntactic tools to formalize and process these complex annotations.
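To make the idea of position, length, and linguistic information concrete, here is a minimal Python sketch of an annotation store. It is not NooJ's internal format or API: the class names `Annotation` and `TextAnnotationStructure`, the list-of-spans layout, the sample sentence, and the labels such as `can,V` are all assumptions chosen for illustration. An annotation with a single span that is shorter than its word form stands for a unit inside that word form, and an annotation with several spans stands for a discontinuous unit.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """One recognized linguistic unit (hypothetical layout, not NooJ's own format)."""
    spans: tuple  # ((start, length), ...); more than one span = discontinuous unit
    info: str     # linguistic information; the label syntax here is illustrative

    def covers(self, position):
        # True if the character offset falls inside any span of this annotation.
        return any(start <= position < start + length
                   for start, length in self.spans)


@dataclass
class TextAnnotationStructure:
    """A text together with the annotations attached to it (illustrative only)."""
    text: str
    annotations: list = field(default_factory=list)

    def add(self, annotation):
        self.annotations.append(annotation)

    def at(self, position):
        # All annotations that cover a given character offset.
        return [a for a in self.annotations if a.covers(position)]

    def surface(self, annotation):
        # Text fragments of the annotation, joined by a space.
        return " ".join(self.text[s:s + n] for s, n in annotation.spans)


tas = TextAnnotationStructure("She cannot turn the lights off")

# Two units inside the single word form "cannot" (offsets 4-9).
tas.add(Annotation(((4, 3),), "can,V"))
tas.add(Annotation(((7, 3),), "not,ADV"))

# One discontinuous unit: "turn" (offset 11) plus "off" (offset 27).
tas.add(Annotation(((11, 4), (27, 3)), "turn off,V"))

print(tas.surface(tas.annotations[2]))
print([a.info for a in tas.at(8)])
```

Storing a tuple of spans rather than a single start and length is one simple way to let the same record type cover both ordinary and discontinuous units; other designs, such as linking several single-span annotations together, would serve equally well.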
Similar resources
Syntactic parsing with NooJ
When parsing a text, NooJ’s parsers store all the annotations that they produce in the Text’s Annotation Structure (TAS). At each level of the various linguistic analyses and the corresponding parser, a given parser may add annotations to, or remove annotations from, the TAS. As annotations are attached to larger and larger sequences of texts, the TAS represents the hierarchical structure of th...
NooJ: a Linguistic Annotation System for Corpus Processing
One characteristic of NooJ is that its corpus processing engine uses large-coverage linguistic lexical and syntactic resources. This allows NooJ users to perform sophisticated queries that include any of the available morphological, lexical or syntactic properties. In comparison with INTEX, NooJ uses a new technology (.NET), a new linguistic engine, and was designed with a new range of applicat...
Formalisation de l'amazighe standard avec NooJ (Formalization of the standard Amazigh with NooJ) [in French]
From this perspective, and with the aim of developing linguistic tools and resources, we have undertaken the construction of a NooJ module for the standard Amazigh language (Ameur et al., 2004). This article proposes a formalization of the noun category that makes it possible to generate, from a lexical entry, its gender (masculine, feminine), its number (singular, plural), and its state (...
Morphological analysis of the standard Amazigh language using NooJ platform (Analyse Automatique de la Morphologie Nominale Amazighe) [in French]
Morphological study of Albanian words, and processing with NooJ
We are developing electronic dictionaries and transducers for the automatic processing of the Albanian language. We will analyze the words inside a linear segment of text. We will also study the relationship between units of sense and units of form. The composition of words takes different forms in Albanian. We have found that morphemes are frequently concatenated or simply juxtaposed or contra...